Modeling of Pronunciation, Language and Nonverbal Units at Conversational Russian Speech Recognition
Abstract
The main problems in developing a conversational Russian speech recognition system are pronunciation variability, free word order in sentences, and the presence of speech disfluencies. In this paper, pronunciation variability is modeled by creating multiple word transcriptions. A syntactic-statistical language model that takes long-distance word dependencies into account is proposed for Russian language modeling. The paper also presents an analysis of speech disfluencies, namely artefacts and filled pauses, that were extracted during segmentation of the Russian speech corpus. The recognition accuracy for nonverbal elements in the collected corpus was 87%. The proposed methods for pronunciation variability modeling and syntactic-statistical language model creation were implemented in a software system for Russian speech recognition. Large-vocabulary experiments using the syntactic-statistical language model yielded a word error rate of 33%.
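For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance (substitutions, deletions, insertions) between the reference transcript and the recognizer output, divided by the number of reference words. The abstract does not say how the authors scored their system, so the following is only a minimal sketch of that standard definition; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative only: tokenization by whitespace is an assumption,
    not the scoring procedure used in the paper.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)
```

Under this definition, a reported word error rate of 33% corresponds to roughly one edit for every three reference words.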
Related resources
Code-Switching speech recognition for closely related languages
This work presents an approach to recognition of multispeaker conversational speech with code-switching between Ukrainian and Russian languages. Both inter-sentential and intra-sentential code-switching are handled. The approach takes into account peculiarities of the phonetic systems of the closely related Russian and Ukrainian languages. A cross-lingual LVCSR system is developed. The acoustic model...
Pronunciation Modeling for Large Vocabulary Speech Recognition by Arthur
The large pronunciation variability of words in conversational speech is one of the major causes of low accuracy for automatic speech recognition (ASR). Many pronunciation modeling approaches have been developed to address this problem. Some explicitly manipulate the pronunciation dictionary as well as the set of the units used to define the pronunciations of words. Others model the pronunciati...
Enhanced tree clustering with single pronunciation dictionary for conversational speech recognition
Modeling pronunciation variation is key for recognizing conversational speech. Rather than being limited to dictionary modeling, we argue that triphone clustering is an integral part of pronunciation modeling. We propose a new approach called enhanced tree clustering. This approach, in contrast to traditional decision tree based state tying, allows parameter sharing across phonemes. We show tha...
Modeling Systematic Variations in Pronunciation via a Language-Dependent Hidden Speaking Mode
This paper describes the research efforts of the “Hidden Speaking Mode” group participating in the 1996 summer workshop on speech recognition. The goal of this project is to model pronunciation variations that occur in conversational speech in general and, more specifically, to investigate the use of a hidden speaking mode to represent systematic variations that are correlated with the word seq...
Modeling Systematic Variations in Pronunciation
This paper describes the research efforts of the “Hidden Speaking Mode” group participating in the 1996 summer workshop on speech recognition. The goal of this project is to model pronunciation variations that occur in conversational speech in general and, more specifically, to investigate the use of a hidden speaking mode to represent systematic variations that are correlated with the word seq...
Journal: IJCSA
Volume: 10
Publication date: 2013